AITopics | preference level



GFRIEND: Generative Few-shot Reward Inference through EfficieNt DPO

Zhao, Yiyang, Bai, Huiyu, Zhao, Xuejiao

arXiv.org Artificial Intelligence | Jun-11-2025

The ability to train high-performing reward models with few-shot data is critical for enhancing the efficiency and scalability of Reinforcement Learning from Human Feedback (RLHF). We propose a data augmentation and expansion framework that enables generative reward models trained on small datasets to achieve comparable performance to those trained on large-scale datasets. Traditional methods to train a generative reward model, such as Direct Preference Optimization (DPO), are constrained by inefficiencies in sample pairing and limited data diversity. This work introduces preference refinement, which employs Chain-of-Thought (CoT) sampling to uncover diverse and high-quality preference relationships. It also incorporates a perplexity-based scoring mechanism to assign nuanced preference levels and utilizes Multi-level Direct Preference Optimization (M-DPO) to enable the model to capture finer-grained preference differences between samples. Experimental results demonstrate that the proposed method significantly enhances data efficiency and model performance, enabling reward models trained in a few-shot setting to achieve results on par with those trained on large-scale datasets. This study underscores the potential of data-efficient strategies in advancing reward model optimization, offering a robust solution for low-resource RLHF applications.
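The two building blocks named in the abstract, perplexity scoring and DPO, can be sketched in a few lines. This is a minimal illustration, not the paper's method: it shows the standard perplexity formula and the standard single-pair DPO loss. How the paper maps perplexity to preference levels, and how M-DPO weights those levels, is not given in the abstract and is not modeled here. The function names and the default `beta=0.1` are assumptions chosen for illustration.

```python
import math


def perplexity(token_logprobs):
    """Perplexity of a sequence: exp of the negative mean token log-probability."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for one preference pair.

    Each argument is the summed log-probability of a full response under the
    policy or the frozen reference model. beta=0.1 is an assumed default.
    """
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    x = -beta * margin
    # -log(sigmoid(beta * margin)) equals softplus(x); computed in a stable form.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    # Hypothetical numbers: the policy favors the chosen response more than
    # the reference does, so the loss falls below log(2).
    print(perplexity([math.log(0.5)] * 4))
    print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

With all four log-probabilities equal, the margin is zero and the loss is log 2; it decreases as the policy separates the chosen response from the rejected one relative to the reference.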

arxiv preprint arxiv, large language model, machine learning, (17 more...)

arXiv: 2506.08965

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Unravelling multi-agent ranked delegations

Colley, Rachael, Grandi, Umberto, Novaro, Arianna

arXiv.org Artificial Intelligence | Nov-25-2021

We introduce a voting model with multi-agent ranked delegations. This model generalises liquid democracy in two aspects: first, an agent's delegation can use the votes of multiple other agents to determine their own -- for instance, an agent's vote may correspond to the majority outcome of the votes of a trusted group of agents; second, agents can submit a ranking over multiple delegations, so that a backup delegation can be used when their preferred delegations are involved in cycles. The main focus of this paper is the study of unravelling procedures that transform the delegation ballots received from the agents into a profile of direct votes, from which a winning alternative can then be determined by using a standard voting rule. We propose and study six such unravelling procedures, two based on optimisation and four using a greedy approach. We study both algorithmic and axiomatic properties, as well as related computational complexity problems of our unravelling procedures for different restrictions on the types of ballots that the agents can submit.
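The idea of unravelling ranked delegation ballots into direct votes can be illustrated with a simplified greedy sketch. It is not one of the six procedures studied in the paper. It assumes a binary issue, delegations that take the majority of an odd-sized group, ballots whose last entry is a direct vote, and a delegation that counts as computable only once every delegate is resolved. The name `unravel` and the ballot encoding are hypothetical.

```python
from collections import Counter


def unravel(ballots):
    """Turn ranked delegation ballots into a profile of direct votes.

    ballots maps each agent to a ranked list of options. An option is either
    a direct vote (a string) or a list of delegates whose majority is copied.
    Each round resolves every agent whose option at the lowest possible rank
    is computable from the votes resolved in earlier rounds.
    """
    resolved = {}
    max_rank = max(len(options) for options in ballots.values())
    while len(resolved) < len(ballots):
        for rank in range(max_rank):
            updates = {}
            for agent, options in ballots.items():
                if agent in resolved or rank >= len(options):
                    continue
                option = options[rank]
                if isinstance(option, str):
                    updates[agent] = option  # direct vote
                elif all(d in resolved for d in option):
                    tally = Counter(resolved[d] for d in option)
                    updates[agent] = tally.most_common(1)[0][0]
            if updates:
                resolved.update(updates)
                break
        else:
            raise ValueError("every ballot must end with a direct vote")
    return resolved


ballots = {
    "ann": [["bob"], "yes"],  # ann and bob delegate to each other: a cycle
    "bob": [["ann"], "no"],
    "cat": ["yes"],
    "dan": [["cat"], "no"],
    "eve": [["cat", "dan", "fay"], "no"],
    "fay": ["no"],
}
profile = unravel(ballots)
```

In this example `ann` and `bob` form a delegation cycle, so both fall back to their second-ranked direct votes, while `dan` and `eve` keep their first-ranked delegations. A standard voting rule, such as majority, can then be applied to `profile`.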

agent, delegation, procedure, (17 more...)

arXiv: 2111.13145

Country:

Europe > Germany (0.04)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(5 more...)

Genre: Research Report (0.63)

Industry:

Government > Voting & Elections (0.46)
Information Technology > Security & Privacy (0.45)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)


Decision Making Over Combinatorially-Structured Domains

Martin, Andrea (Tulane University) | Venable, K. Brent (Tulane University)

AAAI Conferences | Feb-8-2018

We consider a scenario where a user must make a set of correlated decisions and we propose a computational model of the deliberation process. We assume the user compactly expresses her preferences via soft constraints. We consider a sequential procedure that uses Decision Field Theory (DFT) to model the decision making on each variable. We test this procedure on randomly generated tree-shaped Fuzzy Constraint Satisfaction Problems. Our preliminary results showed that the time increases almost in the number of nodes. This is promising in terms of modeling decisions over exponentially large domains. In the future, we plan to compare our results with a non-sequential approach and with behavioral data to assess our approach both in terms of modeling human decision making over complex domains, and adopting DFT as a means of incorporating a form of uncertainty into the soft constraint formalism.
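For readers unfamiliar with fuzzy constraint satisfaction, the sketch below shows the usual fuzzy semantics on a tiny tree-shaped problem: each constraint assigns a preference in [0, 1] to a pair of values, the preference of a complete assignment is the minimum over its constraints, and the goal is to maximize that minimum. It does not implement Decision Field Theory or the sequential procedure from the abstract. The variables, values, and preference numbers are invented for illustration, and the search is plain exhaustive enumeration, which is exponential in the number of variables.

```python
from itertools import product


def best_assignment(domains, constraints):
    """Return the assignment with the highest fuzzy preference, and that level.

    domains maps a variable to its values. constraints maps a pair of
    variables (a, b) to a table {(value_a, value_b): preference in [0, 1]}.
    """
    variables = list(domains)
    best, best_level = None, -1.0
    for values in product(*(domains[v] for v in variables)):
        assignment = dict(zip(variables, values))
        # Fuzzy semantics: an assignment is only as good as its worst constraint.
        level = min(
            table[(assignment[a], assignment[b])]
            for (a, b), table in constraints.items()
        )
        if level > best_level:
            best, best_level = assignment, level
    return best, best_level


# Hypothetical tree-shaped problem: "main" is the root with two children.
domains = {
    "main": ["fish", "meat"],
    "wine": ["white", "red"],
    "dessert": ["cake", "fruit"],
}
constraints = {
    ("main", "wine"): {
        ("fish", "white"): 1.0, ("fish", "red"): 0.4,
        ("meat", "white"): 0.3, ("meat", "red"): 0.9,
    },
    ("main", "dessert"): {
        ("fish", "cake"): 0.6, ("fish", "fruit"): 0.8,
        ("meat", "cake"): 0.7, ("meat", "fruit"): 0.5,
    },
}
solution, level = best_assignment(domains, constraints)
```

Here the best assignment is fish, white wine, and fruit, with preference level 0.8.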

artificial intelligence, constraint, constraint-based reasoning, (17 more...)

Thirty-Second AAAI Conference on Artificial Intelligence

Genre: Research Report > New Finding (0.88)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)


Uncertainty in Soft Temporal Constraint Problems: A General Framework and Controllability Algorithms for The Fuzzy Case

Rossi, F., Venable, K. B., Yorke-Smith, N.

arXiv.org Artificial Intelligence | Oct-10-2011

In real-life temporal scenarios, uncertainty and preferences are often essential and coexisting aspects. We present a formalism where quantitative temporal constraints with both preferences and uncertainty can be defined. We show how three classical notions of controllability (that is, strong, weak, and dynamic), which have been developed for uncertain temporal problems, can be generalized to handle preferences as well. After defining this general framework, we focus on problems where preferences follow the fuzzy approach, and with properties that assure tractability. For such problems, we propose algorithms to check the presence of the controllability properties. In particular, we show that in such a setting dealing simultaneously with preferences and uncertainty does not increase the complexity of controllability testing. We also develop a dynamic execution algorithm, of polynomial complexity, that produces temporal plans under uncertainty that are optimal with respect to fuzzy preferences.

artificial intelligence, constraint, constraint-based reasoning, (13 more...)

doi: 10.1613/jair.2135

arXiv: 1110.2212

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
